
    Benchmarking High Performance Architectures With Natural Language Processing Algorithms

    Natural Language Processing algorithms are resource demanding, especially when they must be tuned to an inflected language such as Polish. The paper presents the time and memory requirements of part-of-speech tagging and clustering algorithms applied to two corpora of the Polish language. The algorithms are benchmarked on three high performance platforms of different architectures. Additionally, sequential versions and OpenMP implementations of the clustering algorithms are compared.

    Comparison of Latent Semantic Analysis and Probabilistic Latent Semantic Analysis for Documents Clustering

    In this paper we compare the usefulness of statistical dimensionality-reduction techniques for improving the clustering of documents in Polish. We start with partitional and agglomerative algorithms applied to the Vector Space Model. We then investigate two transformations: Latent Semantic Analysis and Probabilistic Latent Semantic Analysis. The results show an advantage of Latent Semantic Analysis over the probabilistic model. We also analyse the time and memory consumption of these transformations and present runtime details for an IBM BladeCenter HS21 machine.

    A Case Study of Algorithms for Morphosyntactic Tagging of Polish Language

    The paper presents an evaluation of several part-of-speech taggers, representing the main tagging algorithms, applied to the corpus of the frequency dictionary of contemporary Polish. We report results for two tagging schemes: the IPI PAN positional tagset and its simplified version. Tagging accuracy is calculated for different training sets and broken down into several subcategories (accuracy on known and unknown tokens, word segments, sentences, etc.). The results are compared with those for other inflecting and analytic languages. Performance aspects (time demands) of the tagging tools used are also discussed.

    Increasing Quality of the Corpus of Frequency Dictionary of Contemporary Polish for Morphosyntactic Tagging of the Polish Language

    The paper addresses the correction of the erroneous and ambiguous corpus of the Frequency Dictionary of Contemporary Polish (FDCP) and its application to morphosyntactic tagging of the Polish language. Several stages of corpus transformation are presented, and baseline part-of-speech tagging algorithms are evaluated.

    Application of Weighted Voting Taggers to Languages Described with Large Tagsets

    The paper presents baseline and complex part-of-speech taggers applied to the modified corpus of the Frequency Dictionary of Contemporary Polish, annotated with a large tagset. First, the paper examines the accuracy of six baseline part-of-speech taggers. The main part of the work presents simple weighted voting and complex voting taggers. Special attention is paid to lexical voting methods and to the issues of ties and fallbacks. The TagPair and WPDV voting methods achieve the highest accuracy among all methods considered. The error reduction of 10.8% with respect to the best baseline tagger for the large tagset is comparable with other authors' results for small tagsets.

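    Everyday clinical practice in non-ST-elevation acute coronary syndromes in district hospitals: a registry in Małopolska (Codzienna praktyka kliniczna w ostrych zespołach wieńcowych bez uniesienia odcinka ST w szpitalach rejonowych - rejestr w Małopolsce)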

    Introduction: Non-ST-elevation acute coronary syndromes (NSTE ACS) are diagnosed in more than half of the patients admitted to hospital because of an acute coronary syndrome (ACS). Owing to the round-the-clock availability of catheterization laboratories in Małopolska, every high-risk NSTE ACS patient admitted to a district hospital can be transported to an interventional cardiology centre. The aim of the study was to assess how often patients diagnosed with NSTE ACS are referred to interventional cardiology centres for diagnostic examination of the coronary arteries, and to describe the demographic characteristics and the pharmacotherapy used in this group. Material and methods: Using questionnaires, data were collected on 2382 consecutive patients diagnosed with ACS who were admitted to district hospitals in Małopolska between April 2002 and February 2003. In 1396 patients the diagnosis of non-ST-elevation acute coronary syndrome was ultimately confirmed at discharge. Results: In 42% (n = 582) of patients with a final diagnosis of NSTE ACS, elevated levels of markers of myocardial necrosis, such as troponin T/I or CK-MB, were found (CM+), whereas in 58% (n = 814) of patients no elevation of markers was recorded (CM-). In-hospital mortality was higher in the CM+ group than in the CM- group (3.3% vs. 0.4%; p = 0.0002). Only 17.7% of patients from the whole CM+ group were referred during hospitalization to an interventional cardiology centre for coronary angiography and possible invasive treatment. The percentage of patients referred to the catheterization laboratory increased with the risk determined by the TIMI Risk Score (TIMI Risk Score 0-2 points: 14%; 3-4 points: 15%; 5-7 points: 22%; p = 0.02 for 3-4 vs. 5-7 and p = 0.01 for 0-2 vs. 5-7). At the same time, in-hospital mortality among patients not referred for invasive treatment rose to 0.8% vs. 1.9% vs. 3.5% (p = 0.02 for 0-2 vs. 5-7) for the groups with a TIMI Risk Score of 0-2 vs. 3-4 vs. 5-7, respectively. Patients referred for invasive treatment significantly more often received thienopyridines (68.3% vs. 44.5%; p < 0.0001), IIb/IIIa receptor blockers (1.5% vs. 0.3%; p = 0.04), heparins (92.7% vs. 85%; p = 0.003) and β-blockers (88.3% vs. 78.8%; p = 0.002). Conclusions: Despite 24-hour access to catheterization laboratories, only a small percentage of patients with NSTE ACS are referred from district hospitals. In patients from high-risk groups according to the TIMI classification (TIMI Risk Score 5-7 points, including those with elevated levels of markers of myocardial necrosis) who are not referred for invasive treatment, the prognosis remains poor despite pharmacological therapy. (Folia Cardiol. 2005; 12: 21-31)

    Selectivity of major isoquinoline alkaloids from Chelidonium majus towards telomeric G-quadruplex: A study using a transition-FRET (t-FRET) assay

    Background: Natural bioproducts are invaluable resources in drug discovery. Isoquinoline alkaloids of Chelidonium majus constitute a structurally diverse family of natural products that are of great interest, in part because of their selectivity for the human telomeric G-quadruplex structure and their inhibition of telomerase. Methods: The study focuses on the mechanism of telomerase inhibition through stabilization of telomeric G-quadruplex structures by berberine, chelerythrine, chelidonine, sanguinarine and papaverine. Telomerase activity and mRNA levels of hTERT were estimated using the quantitative telomere repeat amplification protocol (q-TRAP) and qPCR in MCF-7 cells treated with different groups of alkaloids. The selectivity of the main isoquinoline alkaloids of Chelidonium majus towards telomeric G-quadruplex forming sequences was explored using a sensitive modified thermal FRET-melting measurement in the presence of the complementary oligonucleotide CT22. We assessed and monitored G-quadruplex topologies using circular dichroism (CD) methods, and compared the spectra to previously well-characterized motifs, either alone or in the presence of the alkaloids. Molecular modeling was performed to rationalize ligand binding to the G-quadruplex structure. Results: The results highlight strong inhibitory effects of chelerythrine, sanguinarine and berberine on telomerase activity, most likely through substrate sequestration. These isoquinoline alkaloids interacted strongly with the telomeric G-quadruplex sequence. In comparison, chelidonine and papaverine had no significant interaction with the telomeric quadruplex, while they strongly inhibited telomerase at the level of hTERT transcription. Altogether, all of the studied alkaloids showed various levels and mechanisms of telomerase inhibition. Conclusions: We report a comparative study of the anti-telomerase activity of the isoquinoline alkaloids of Chelidonium majus. Chelerythrine was the most effective in inhibiting telomerase activity by substrate sequestration through G-quadruplex stabilization. General significance: Understanding the structural and molecular mechanisms of anti-cancer agents can help in developing new and more potent drugs with fewer side effects. Isoquinolines are the most biologically active agents from Chelidonium majus, and have been shown to be telomeric G-quadruplex stabilizers and potent telomerase inhibitors.

    Trends in Modern Exception Handling

    No full text
    Exception handling is now a necessary component of error-proof information systems. The paper presents an overview of exception handling techniques and models, the problems connected with them, and potential solutions. It also considers the implementation of propagation mechanisms and exception handling, and their effect on semantics and general program efficiency. The presented mechanisms have been adopted in modern programming languages. In the areas of design, formal methods, and formal verification of program properties, exception handling mechanisms are weakly represented, which leaves room for future research.

    Benchmarking high performance architectures with natural language processing algorithms (Benchmarking architektur wysokiej wydajności algorytmami przetwarzania języka naturalnego)

    No full text
    Title from header. Bibliography: pp. 30-31. Natural Language Processing algorithms are resource demanding, especially when they must be tuned to an inflected language such as Polish. The paper presents the time and memory requirements of part-of-speech tagging and clustering algorithms applied to two corpora of the Polish language. The algorithms are benchmarked on three high performance platforms of different architectures. Additionally, sequential versions and OpenMP implementations of the clustering algorithms are compared. Also available in printed form. KEYWORDS: benchmarking, part-of-speech tagging, document clustering, natural language processing, high performance architectures